Method of designing agonists and antagonists to EGF receptor family

ABSTRACT

The present invention relates to a method of designing compounds able to bind to a molecule of the EGF receptor family and to modulate the activity mediated by the receptor molecule based on the 3-D structure coordinates of the EGF receptor crystal of FIG. 6.

FIELD OF THE INVENTION

This invention relates to the field of epidermal growth factor (EGF) receptor structure and EGF receptor/ligand interactions. In particular, it relates to the field of using the EGF receptor structure to select and screen for ligands of the EGF receptor.

BACKGROUND OF THE INVENTION

Epidermal growth factor is a small polypeptide cytokine that stimulates marked proliferation of epithelial tissues and is a member of a larger family of structurally related cytokines such as transforming growth factor α (TGFα), amphiregulin, betacellulin, heparin-binding EGF and some viral gene products. Abnormal EGF family signalling is a characteristic of certain cancers (Soler, C. & Carpenter, G., 1994 In Nicola, N. (ed) “Guidebook to Cytokines and their Receptors”, Oxford Univ. Press, Oxford, pp 194–197; Walker, F. & Burgess, A. W., 1994, In Nicola, N. (ed) “Guidebook to Cytokines and their Receptors”, Oxford Univ. Press, Oxford, pp 198–201).

The epidermal growth factor receptor (EGFR) is the cell membrane receptor for EGF (Ullrich, A., and Schlessinger, J. (1990) Cell 61, 203–212). The EGFR also binds other ligands that contain amino acid sequences classified as the EGF-like motif. Among these ligands, the three-dimensional structures of EGF and TGFα have been determined by NMR (Montelione, G. T.; Wuthrich, K.; Nice, E. C., Burgess, A. W. and Scheraga, H. A. (1986) PNAS 83(22): 8594–8; Campbell, I. D., Cooke, R. M., Baron, M., Harvey, T. S., and Tappin, M. J. (1989) Prog. Growth Factor Res. 1, 13–22). Upon binding of the ligand to the extracellular domain, the EGFR undergoes dimerization, which eventually leads to the activation of its cytoplasmic protein tyrosine kinase (Ullrich, A., and Schlessinger, J. (1990) Cell 61, 203–212). The EGFR is also known as the ErbB-1 receptor and belongs to the type I family of receptor tyrosine kinases (Ullrich, A., and Schlessinger, J. (1990) Cell 61, 203–212). This group also includes the ErbB-2, ErbB-3 and ErbB-4 receptors. The ligand of ErbB-2 is still unknown but it is clear that heregulin binds to ErbB-3 and ErbB-4 (Plowman, G. D., Green, J. M., Calouscou, J. M., Carlton, G. W., Rothwell, V. M., and Buckley, S. (1993) Nature 366, 473–475). One of the heregulins is known as neuregulin or NDF and contains an EGF-like sequence that was found to fold into an EGF-like fold by NMR (Nagata, K., Kohda, D., Hatanska, H., Ichikawa, S., Matsuda, S., Yamamoto, T., Suzuki, A., and Inagaki, F. (1994) EMBO J. 13, 3517–3523 and Jacobson, N. E., Abadl, N., Sliwkowski, M. X., Reilly, D., Skelton, N. J., and Fairbrother, W. J. (1996) Biochemistry 36, 3402–3417).

The type II family of receptor tyrosine kinases consists of the insulin receptor (INSR), the insulin-like growth factor I receptor (IGF-1), and the insulin receptor-related receptor (Ullrich, A., and Schlessinger, J. (1990) Cell 61, 203–212). Although the type II receptors consist of four chains (α₂β₂), both the extracellular portions of the receptors from the two families, as well as the tyrosine kinase portions, share significant sequence homology, suggesting a common evolutionary origin (Ullrich, A., and Schlessinger, J. (1990) Cell 61, 203–212, and Bajaj, M., Waterfield, M. D., Schlessinger, J., Taylor, W. R., and Blundell, T. (1987) Biochim. Biophys. Acta 916, 220–226).

The 621 amino acid residues of the extracellular domain of the human EGFR (sEGFR) can be subdivided into four domains as follows: L1, S1, L2 and S2, where L and S stand for “large” and “small” domains, respectively (Bajaj, M., Waterfield, M. D., Schlessinger, J., Taylor, W. R., and Blundell, T. (1987) Biochim. Biophys. Acta 916, 220–226, see FIG. 2). The L1 and L2 domains are homologous, as are the S1 and S2 domains.

Ligand-induced dimerization was first reported for the EGF receptor (Schlessinger, J. (1980) Trends Biochem Sci 13, 443–447) and now is widely accepted as a general mechanism for the transmission of growth stimulatory signals across the cell membrane. Although many biochemical experiments have been performed to reveal the molecular mechanism of receptor dimerization (Lemmon, M. A., Bu, Z., Ladbury, J. E., Zhou, M., Pinchasi, D., Lax, L., Engelman, D. M., and Schlessinger, J. (1997) EMBO J. 16, 281–294 and Tzabar, E., Pinkas-Kramarski, R., Moyer, J. D., Klapper, D. N., Alroy, L., Levkowitz, G., Shelly, M., Henis, S., Eisenstein, M., Ratzkin, B. J., Sela, M., Andrews, G. C., and Yarden, Y. (1997) EMBO J. 16, 4938–4950 and Lax, L., Mitra, A. K., Ravern, C., Hurwitz, D. R., Rubinstein, M., Ullrich, A., Stroud, R. M., and Schlessinger, J. (1991), J. Biol. Chem. 266, 13828–13833), the molecular mechanism by which monomeric ligands induce dimerization is still unknown for members of the EGFR family. Single particle averaging of electron microscopic images suggests that the overall shape of the sEGFR is four-lobed and doughnut-like (Lax, L., Mitra, A. K., Ravern, C., Hurwitz, D. R., Rubinstein, M., Ullrich, A., Stroud, R. M., and Schlessinger, J. (1991), J. Biol. Chem. 266, 13828–13833). Small angle x-ray scattering also indicates that the sEGFR is a flattened sphere with long diameters of 110 Å and a short diameter of 20 Å (Lemmon, M. A., Bu, Z., Ladbury, J. E., Zhou, M., Pinchasi, D., Lax, L., Engelman, D. M., and Schlessinger, J. (1997) EMBO J. 16, 281–294). The crystallization of sEGFR in complex with EGF has been published (Günther, N., Betzel, C., and Weber, W. (1990) J. Biol. Chem. 265, 22082–22085; Degenhardt M., Weber W., Eschenburg S. Dierks K., Funari S S., Rapp G. and Betzel C. (1998) Acta Crystallogr. D Biol. Crystallogr. 54:999–1001), but the structure has not yet been reported, despite a decade of effort by many groups.

One EGF receptor ligand, TGF-α, has been observed to be overproduced in keratinocyte cells which are subject to psoriasis (Turbitt, M. L. et al., 1990, J. Invest. Dermatol. 95(2), 229–232; Higashimyama, M. et al., 1991, J. Dermatol., 18(2), 117–119; Elder, J. T. et al, 1990, 94(1), 19–25). The overproduction of at least one other EGF receptor ligand, amphiregulin, has also been implicated in psoriasis (Piepkorn, M. 1996, Am. J. Dermatopath., 18(2), 165–171). Molecules that inhibit the EGF receptor have been shown to inhibit the proliferation of both normal keratinocytes (Dvir, A. et al, 1991, J. Cell Biol., 113(4), 857–865) and psoriatic keratinocytes (Ben-Bassat, H. et al., 1995, Exp. Dermatol., 4(2), 82–88). These findings indicate that EGF receptor antagonists may be useful in the treatment of psoriasis.

Many cancer cells express constitutively active EGFR (Sandgreen, E. P., et al., 1990, Cell, 61:1121–135; Karnes, W. E. J., et al., 1992, Gastroenterology, 102:474–485) or other EGFR family members (Hynes, N. E., 1993, Semin. Cancer Biol. 4:19–26). Elevated levels of activated EGFR occur in bladder, breast, lung and brain tumours (Harris, A. L., et al., 1989, In Furth & Greaves (eds) The Molecular Diagnostics of human cancer. Cold Spring Harbor Lab. Press, CSH, NY, pp 353–357). Antibodies to EGFR can inhibit ligand activation of EGFR (Sato, J. D., et al., 1983 Mol. Biol. Med. 1:511–529) and the growth of many epithelial cell lines (Aboud-Pirak E., et al., 1988, J. Natl Cancer Inst. 85:1327–1331). Patients receiving repeated doses of a humanised chimeric anti-EGFR monoclonal antibody (Mab) showed signs of disease stabilization. The large doses required and the cost of production of humanised Mab are likely to limit the application of this type of therapy. These findings indicate that EGF receptor antagonists will be attractive anticancer agents.

SUMMARY OF THE INVENTION

The present inventors have now obtained three-dimensional structural information concerning the epidermal growth factor receptor (EGFR). This structural information was obtained by comparative modelling based on the three-dimensional structure of the IGF-1 receptor as described in PCT/AU98/00998. The information presented in the present application can be used to predict the structure of related members of the EGF receptor family, and to develop specific ligands of members of the EGF receptor family for therapeutic applications.

Accordingly, in a first aspect the present invention provides a method of designing a compound which binds to a molecule of the EGF receptor family and modulates an activity mediated by the molecule, which method comprises the step of assessing the stereochemical complementarity between the compound and a topographic region of the molecule, wherein the molecule is characterised by

-   (i) amino acids 1–621 of the EGF receptor positioned at atomic coordinates substantially as shown in FIG. 6;
-   (ii) one or more subsets of said amino acids related to the coordinates shown in FIG. 6 by whole body translations and/or rotations; or
-   (iii) amino acids present in the amino acid sequence of a member of the EGF receptor family, which form an equivalent three-dimensional structure to that of the receptor site defined by amino acids 1–621 of the EGF receptor positioned at atomic coordinates substantially as shown in FIG. 6.
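By way of non-limiting illustration, item (ii) above can be pictured as applying a rotation and a translation to every atom of a coordinate set, which leaves all interatomic distances unchanged. The following is a minimal sketch only; the function names and the three sample points are hypothetical and are not taken from FIG. 6.

```python
import math

def rotate_z(angle_rad):
    """Return a 3x3 rotation matrix for a rotation about the z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rigid_transform(coords, rotation, translation):
    """Apply x' = R x + t to every (x, y, z) point in coords."""
    moved = []
    for p in coords:
        q = tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        )
        moved.append(q)
    return moved

# Hypothetical coordinates in Angstroms (placeholders, not from FIG. 6).
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.3)]

# Rotate by 90 degrees about z, then shift 10 Angstroms along x.
moved = rigid_transform(atoms, rotate_z(math.pi / 2), (10.0, 0.0, 0.0))

# A whole-body motion preserves every interatomic distance.
before = math.dist(atoms[0], atoms[2])
after = math.dist(moved[0], moved[2])
print(abs(before - after) < 1e-9)
```

Two coordinate sets that differ only by such a transformation describe the same three-dimensional structure, which is why they are treated as equivalent here.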

In a preferred embodiment of the first aspect, the topographic region of the molecule is defined by amino acids 1–475 of the EGF receptor, or an amino acid sequence which forms an equivalent three-dimensional structure to that of the region defined by amino acids 1–475 of the EGF receptor positioned at atomic coordinates substantially as shown in FIG. 6.

In a further preferred embodiment of the first aspect, the topographic region of the molecule is defined by amino acids 313–621 of the EGF receptor, or an amino acid sequence which forms an equivalent three-dimensional structure to that of the region defined by amino acids 313–621 of the EGF receptor positioned at atomic coordinates substantially as shown in FIG. 6.

The phrase “EGF receptor family” includes, but is not limited to, the EGF receptor, ErbB2, ErbB3 and ErbB4. In general, EGF receptor family molecules show similar domain arrangements and share significant sequence identity, preferably at least 40% identity.
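As an illustrative sketch only, percentage sequence identity between two pre-aligned sequences may be computed as below. The convention used here (identical positions divided by positions where neither sequence has a gap) is an assumption, since several conventions are in common use, and the two short sequences are invented placeholders rather than receptor sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over aligned, gap-free columns.

    Both inputs must already be aligned and therefore of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Invented toy alignment: five gap-free columns, four of them identical.
identity = percent_identity("ACDE-G", "ACDQFG")
print(identity >= 40.0)
```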

The EGF receptor molecule defined in the first aspect of the present invention is depicted in FIG. 5. The fragment comprising residues 1–475 of the receptor comprises the L1, S1 and L2 domains of the ectodomain of the EGF receptor. At the centre of this structure is a cavity, bounded by all three domains, of sufficient size to accommodate a ligand molecule.

The fragment comprising residues 313–621 comprises the L2 and S2 domains, which are positioned such that they form a “corner” structure. It is envisaged that this corner structure provides a further binding site for ligands of EGF receptor family members.

By “stereochemical complementarity” we mean that the substance or a portion thereof correlates, in the manner of the classic “lock-and-key” visualisation of ligand-receptor interaction, with the cavity in the receptor site.

In a preferred embodiment of the first aspect of the present invention, the method further involves selecting or designing a compound which has portions that match residues positioned on the surface of the receptor site as depicted in FIGS. 7, 8 and 9. By “match” we mean that the identified portions interact with the surface residues, for example, via hydrogen bonding or by enthalpy-reducing Van der Waals interactions which promote desolvation of the biologically active compound within the site, in such a way that retention of the compound within the cavity is favoured energetically.

In a further preferred embodiment of the first aspect of the present invention, the method includes screening for, or designing, a compound which possesses a stereochemistry and/or geometry which allows it to interact with both the L1 and L2 domains of the receptor site. It is believed that EGFR monomers may dimerise in nature in such a manner that the cavities of each monomer may face each other. Accordingly, the method of the first aspect of the present invention may involve screening for, or designing, a biologically active compound which interacts with the L1 domain of one monomer and the L2 domain of the other monomer.

In a further preferred embodiment of the first aspect of the present invention the compound interacts with a fragment in the region of the L1 domain-S1 domain interface, causing an alteration in the positions of the domains relative to each other. Preferably, the interaction of the compound causes the L1 and S1 domains to move away from each other. In a further preferred embodiment the compound interacts with the hinge region between the S1 domain and the L2 domain causing an alteration in the positions of these domains relative to each other. In a further preferred embodiment the compound interacts with the β sheet of the L1 domain causing an alteration in the position of the L1 domain relative to the position of the S1 domain or L2 domain.

In a further preferred embodiment, the compound binds to a lower face (according to orientations shown in FIGS. 3 and 4) containing the second β-sheet of the L1 and/or L2 domains, wherein the structure of the face is characterised by a plurality of solvent-exposed hydrophobic residues. Examples of these hydrophobic residues include Tyr64, Leu66, Tyr89, Tyr93 (see FIG. 7), Leu348, Phe380 and Phe412 (see FIG. 10).

In a further preferred embodiment the compound interacts with the hinge region between the L2 and S2 domains, causing an alteration in the positions of the L1 and L2 domains relative to each other. Preferably, the interaction of the compound causes the L1 and L2 domains to move away from each other.

In a further preferred embodiment the compound interacts with the β sheet of the L2 domain causing an alteration in the position of the L2 domain relative to the position of the L1 domain.

In a further preferred embodiment of the present invention, the stereochemical complementarity is such that the compound has a K_(d) for the receptor site of less than 10⁻⁶M. More preferably, the K_(d) value is less than 10⁻⁸M and even more preferably less than 10⁻⁹M.
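For orientation only, a dissociation constant can be related to a standard binding free energy through the textbook relation ΔG° = RT ln(K_(d)), with K_(d) expressed relative to a 1 M standard state. The sketch below evaluates that relation for the three thresholds recited above; the temperature of 298.15 K and the function name are assumptions made for illustration.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)

def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kcal/mol from Kd in mol/L.

    Uses dG = R * T * ln(Kd); a smaller Kd gives a more negative dG.
    The default temperature is an assumption for this sketch.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)

for kd in (1e-6, 1e-8, 1e-9):
    print(f"Kd = {kd:.0e} M  ->  dG = {binding_free_energy(kd):.1f} kcal/mol")
```

Under these assumptions a K_(d) of 10⁻⁶M corresponds to roughly −8.2 kcal/mol, and each tenfold decrease in K_(d) makes the free energy more favourable by about 1.4 kcal/mol.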

In preferred embodiments of the first aspect of the present invention, the compound is selected or modified from a known compound identified from a data base.

In one embodiment of the first aspect, the compound has the ability to increase an activity mediated by the molecule of the EGF receptor family.

In another embodiment, the compound has the ability to decrease an activity mediated by the molecule of the EGF receptor family. Preferably, the stereochemical interaction between the compound and the receptor site is adapted to prevent the binding of a natural ligand of the molecule of the EGF receptor family to the receptor site. Preferably, the compound has a K_(i) of less than 10⁻⁶M, more preferably less than 10⁻⁸M and even more preferably less than 10⁻⁹M.

In a second aspect the present invention provides a computer-assisted method for identifying potential compounds able to bind to a molecule of the EGF receptor family and to modulate an activity mediated by the molecule, using a programmed computer comprising a processor, an input device, and an output device, comprising the steps of:

-   (a) inputting into the programmed computer, through the input device, data comprising the atomic coordinates of the EGF receptor molecule as shown in FIG. 6, or a subset thereof;
-   (b) generating, using computer methods, a set of atomic coordinates of a structure that possesses stereochemical complementarity to the atomic coordinates of the EGF receptor site as shown in FIG. 6, or a subset thereof, thereby generating a criteria data set;
-   (c) comparing, using the processor, the criteria data set to a computer database of chemical structures;
-   (d) selecting from the database, using computer methods, chemical structures which are similar to a portion of said criteria data set; and
-   (e) outputting, to the output device, the selected chemical structures which are similar to a portion of the criteria data set.
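Purely as an illustrative sketch of the data flow in steps (a) to (e), the fragment below compares a criteria data set with a small database using a rotation- and translation-invariant profile of pairwise distances. This similarity measure is a simple stand-in chosen for brevity and is not asserted to be the comparison used in the invention; the criteria points, database entries, tolerance and function names are all hypothetical.

```python
import math
from itertools import combinations

def distance_profile(points):
    """Sorted pairwise distances; unchanged by rotation or translation."""
    return sorted(math.dist(a, b) for a, b in combinations(points, 2))

def profile_rmsd(p, q):
    """Root-mean-square difference between two equal-length profiles."""
    if len(p) != len(q):
        return math.inf
    if not p:
        return 0.0
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)) / len(p))

def screen(criteria_points, database, tolerance=0.5):
    """Steps (c) and (d): compare each entry and keep the similar ones."""
    target = distance_profile(criteria_points)
    hits = []
    for name, points in database.items():
        score = profile_rmsd(target, distance_profile(points))
        if score <= tolerance:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: hit[1])

# Steps (a) and (b): a hypothetical criteria data set of three points.
criteria = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]

# Hypothetical database of candidate structures (Angstrom coordinates).
database = {
    "cand_A": [(1.0, 1.0, 1.0), (4.0, 1.0, 1.0), (1.0, 5.0, 1.0)],
    "cand_B": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    "cand_C": [(0.0, 0.0, 0.0), (3.2, 0.0, 0.0), (0.0, 4.1, 0.0)],
}

# Step (e): output the selected structures, best match first.
hits = screen(criteria, database)
for name, score in hits:
    print(f"{name}: profile RMSD {score:.2f}")
```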

In a preferred embodiment of the second aspect, the method is used to identify potential compounds which have the ability to decrease an activity mediated by the receptor.

In a further preferred embodiment of the second aspect, the method further comprises the step of selecting one or more chemical structures from step (e) which interact with the receptor site of the molecule in a manner which prevents the binding of natural ligands to the receptor site.

In a further preferred embodiment of the second aspect, the method further comprises the step of obtaining a compound with a chemical structure selected in steps (d) and (e), and testing the compound for the ability to decrease an activity mediated by the receptor.

In a further preferred embodiment of the second aspect, the method is used to identify potential compounds which have the ability to increase an activity mediated by the receptor molecule.

In a further preferred embodiment of the second aspect, the method further comprises the step of obtaining a compound with a chemical structure selected in steps (d) and (e), and testing the compound for the ability to increase an activity mediated by the receptor molecule.

The present invention also provides a method of screening a putative compound having the ability to modulate the activity of a molecule of the EGF receptor family, comprising the steps of identifying a putative compound by a method according to the first or second aspects, and testing the compound for the ability to increase or decrease an activity mediated by the molecule. In one embodiment, the test is carried out in vitro. Preferably, the in vitro test is a high throughput assay. In another embodiment, the test is carried out in vivo.

In a third aspect the present invention provides a compound able to bind to a molecule of the EGF receptor family and to modulate an activity mediated by the molecule, the compound being obtained by a method according to the present invention.

In a preferred embodiment of the third aspect, the compound is a mutant ligand of a molecule of the EGF receptor family, where at least one mutation occurs in the region of the ligand which interacts with residues on the surface of the receptor site facing toward the cavity. For example, the residues Arg 41 and Tyr 13 in EGF are conserved in other members of the EGF receptor family of ligands (a Phe residue may be substituted for Tyr 13). Structures of several EGF family members show the two residues to be in close proximity (Groenen, L. C., Nice, E. C., Burgess, A. W., 1994, Growth Factors 11:235–257). This portion of EGF may interact with a hydrophobic portion of the EGF receptor which contains one or more negatively charged residues such as the lower β sheet of the L1 domain. Mutants of EGF which show altered activity may be generated by introducing modifications to Arg 41 or Tyr 13 or other nearby residues. Alternatively, mutants of EGF may be generated by introducing modifications to residues on the opposite side of the ligand which may interact with a second receptor molecule in the unmodified ligand.

In a fourth aspect the present invention provides a compound which possesses stereochemical complementarity to a topographic region of a molecule of the EGF receptor family and modulates an activity mediated by the molecule, wherein the molecule is characterised by

-   (i) amino acids 1–621 of the EGF receptor positioned at atomic coordinates substantially as shown in FIG. 6;
-   (ii) one or more subsets of said amino acids, related to the coordinates shown in FIG. 6 by whole body translations and/or rotations; or
-   (iii) amino acids present in the amino acid sequence of a member of the EGF receptor family, which form an equivalent three-dimensional structure to that of the receptor site defined by amino acids 1–621 of the EGF receptor positioned at atomic coordinates substantially as shown in FIG. 6;

with the proviso that the compound is not a naturally occurring ligand of a molecule of the EGF receptor family or a mutant thereof.

By “mutant” we mean a ligand which has been modified by one or more point mutations, insertions of amino acids or deletions of amino acids.

In a preferred embodiment of the fourth aspect, the topographic region of the molecule is defined by amino acids 1–475 of the EGF receptor or an amino acid sequence which forms an equivalent three-dimensional structure to that of the region defined by amino acids 1–475 of the EGF receptor positioned at atomic coordinates substantially as shown in FIG. 6.

In a further preferred embodiment of the fourth aspect, the topographic region of the molecule is defined by amino acids 313–621 of the EGF receptor or an amino acid sequence which forms an equivalent three-dimensional structure to that of the region defined by amino acids 313–621 of the EGF receptor positioned at atomic coordinates substantially as shown in FIG. 6.

In preferred embodiments of the third and fourth aspects, the stereochemical complementarity between the compound and the receptor site is such that the compound has a K_(d) for the receptor site of less than 10⁻⁶M, more preferably less than 10⁻⁸M.

In some embodiments of the third and fourth aspects, the compound increases an activity mediated by the EGF receptor.

In other embodiments of the third and fourth aspects, the compound decreases an activity mediated by the EGF receptor.

In a fifth aspect, the present invention provides a pharmaceutical composition for preventing or treating a disease which would benefit from increased signalling by a molecule of the EGF receptor family, which comprises a compound according to the third or fourth aspects of the present invention and a pharmaceutically acceptable carrier or diluent.

In a sixth aspect, the present invention provides a pharmaceutical composition for preventing or treating a disease associated with signalling by a molecule of the EGF receptor family which comprises a compound according to the third or fourth aspects of the present invention and a pharmaceutically acceptable carrier or diluent.

In a seventh aspect the present invention provides a method of preventing or treating a disease which would benefit from increased signalling by a molecule of the EGF receptor family which method comprises administering to a subject in need thereof a compound according to the third or fourth aspects of the present invention. Preferably, the disease is selected from wound healing and gastric ulcers.

In an eighth aspect the present invention provides a method of preventing or treating a disease associated with signalling by a molecule of the EGF receptor family which method comprises administering to a subject in need thereof a compound according to the third or fourth aspects of the present invention. Preferably, the disease is selected from psoriasis and tumour states comprising but not restricted to cancer of the breast, brain, ovary, cervix, pancreas, lung, head and neck, and melanoma, rhabdomyosarcoma, mesothelioma and glioblastoma.

Throughout this specification, the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1: Sequence alignment of human EGF receptor family proteins with IGF-1 receptor sequences and insulin receptor sequence for the first two domains of the EGF receptor. The alignment of the EGF receptor and the various IGF-1 receptor sequences was used by the MODELLER program to create a model of the EGF receptor domains L1 and S1. Residues which are underlined were used to create additional Cα-Cα restraints for the construction of the EGF receptor model. Disulfide bonds are also indicated by lines between cysteine residues. The modules of the EGF receptor S1 domain are numbered.

FIG. 2: Sequence alignment of human EGF receptor family proteins with IGF-1 receptor sequences and insulin receptor sequence for the third and fourth domains of the EGF receptor. Additional labels and lines are similar to those in FIG. 1.

FIG. 3: Model polypeptide fold of the L1 and S1 domains of the EGF receptor. The L1 is at the left hand side of the structure with the N-terminus facing the front. Cysteine residue sidechains are depicted as sticks.

FIG. 4: Model polypeptide fold of the L2 and S2 domains of the EGF receptor. The L2 is at the bottom of the structure with the N-terminus facing the front. Cysteine residue sidechains are depicted as sticks.

FIG. 5: Superposition of the two models (of the L1 and S1 domain and of L2 and S2 domains) onto the structure of the first three domains of the IGF-1 receptor. Cysteine residue sidechains are depicted as sticks. Selected residues are shown as spheres and labelled.

FIG. 6: Coordinates of the two models of the EGF receptor extracellular domain. The first model (6A-1 through top half of 6A-31) consists of the domains L1 and S1. The second model (6A-31 (bottom half) through 6A-62) consists of the domains L2 and S2. The coordinates are in relation to a Cartesian set of orthogonal axes. The L1, S1 and L2 domains of the EGF receptor models have been superimposed on the crystal structure of the IGF-1 receptor domains L1, cysteine-rich domain and L2. The final column contains the number 20, 40 or 60, depending on whether the residue containing the atom is judged to be well-modeled, have a moderate possibility of error, or is likely to be inaccurate, respectively.
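As a hedged illustration of how the final-column values of 20, 40 and 60 might be used, the sketch below reads coordinate records and keeps only residues at or below a chosen reliability value. The whitespace-separated record layout (atom name, residue name, residue number, x, y, z, reliability value) is an assumption made for this example, since the exact layout of FIG. 6 is not reproduced here, and the sample records are invented placeholders.

```python
# Meaning of the final-column values, as described for FIG. 6.
CONFIDENCE_LABELS = {
    20: "well-modelled",
    40: "moderate possibility of error",
    60: "likely to be inaccurate",
}

def parse_record(line):
    """Parse one record in the ASSUMED layout:
    atom name, residue name, residue number, x, y, z, reliability value."""
    atom, res_name, res_num, x, y, z, flag = line.split()
    return {
        "atom": atom,
        "res_name": res_name,
        "res_num": int(res_num),
        "xyz": (float(x), float(y), float(z)),
        "flag": int(float(flag)),
    }

def residues_with_flag_at_most(records, max_flag):
    """Sorted residue numbers whose reliability value is <= max_flag."""
    return sorted({r["res_num"] for r in records if r["flag"] <= max_flag})

# Invented sample records (not taken from FIG. 6).
sample_lines = [
    "CA ALA 1 12.1 3.4 -7.8 20",
    "CA ALA 2 14.0 5.2 -6.1 20",
    "CA ALA 3 30.5 1.0 2.2 40",
    "CA ALA 4 41.3 -8.9 9.7 60",
]
records = [parse_record(line) for line in sample_lines]

# Keep only residues judged to be well-modelled.
reliable = residues_with_flag_at_most(records, 20)
print(reliable)
```

Restricting a design calculation to residues carrying the value 20, or 20 and 40, is one way of limiting the influence of poorly modelled regions.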

FIG. 7: Part of the model polypeptide fold of the L1 and S1 domains of the EGF receptor. Side chains of residues from the L1 domain which face towards the large cavity (shown in FIG. 5) are shown in ball and stick notation and labelled with residue number and the one letter code.

FIG. 8: Part of the model polypeptide fold of the L1 and S1 domains of the EGF receptor. Side chains of residues from the S1 domain which face towards the large cavity (shown in FIG. 5) are shown in ball and stick notation and labelled using the one letter code.

FIG. 9: Part of the model polypeptide fold of the L2 and S2 domains of the EGF receptor. Side chains of residues from the L2 domain which face towards the large cavity (shown in FIG. 5) are shown in ball and stick notation and labelled using the one letter code.

FIG. 10: Part of the model polypeptide fold of the L2 and S2 domains of the EGF receptor. Solvent exposed residues from the face of the L2 domain containing the large β sheet are shown in ball and stick representation.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS OF THE INVENTION

The present inventors have developed three dimensional structural information about the EGF receptor to enable a more accurate understanding of how the binding of ligand leads to signal transduction. Such information provides a rational basis for the development of ligands for specific therapeutic applications, something that heretofore could not have been predicted de novo from available sequence data.

The precise mechanisms underlying the binding of agonists and antagonists to the EGF receptor are not fully clarified. However, the binding of ligands to the receptor site, preferably with an affinity in the order of 10⁻⁸M or higher, is understood to arise from enhanced stereochemical complementarity relative to naturally occurring EGF receptor ligands.

Such stereochemical complementarity, pursuant to the present invention, is characteristic of a molecule that matches intra-site surface residues lining the groove of the receptor site as enumerated by the coordinates set out in FIG. 6. The residues lining the groove are depicted in FIGS. 7, 8 and 9. By “match” we mean that the identified portions interact with the surface residues, for example, via hydrogen bonding or by enthalpy-reducing Van der Waals interactions which promote desolvation of the biologically active compound within the site, in such a way that retention of the biologically active compound within the groove is favoured energetically.

Substances which are complementary to the shape of the receptor site characterised by amino acids positioned at atomic coordinates set out in FIG. 6 may be able to bind to the receptor site and, when the binding is sufficiently strong, substantially prohibit binding of the naturally occurring ligands to the site.

It will be appreciated that it is not necessary that the complementarity between ligands and the receptor site extend over all residues lining the groove in order to inhibit binding of the natural ligand. Accordingly, agonists or antagonists which bind to a portion of the residues lining the groove are encompassed by the present invention.

In general, the design of a molecule possessing stereochemical complementarity can be accomplished by means of techniques that optimize, either chemically or geometrically, the “fit” between a molecule and a target receptor. Known techniques of this sort are reviewed by Sheridan and Venkataraghavan, Acc. Chem Res. 1987 20 322; Goodford, J. Med. Chem. 1984 27 557; Beddell, Chem. Soc. Reviews 1985, 279; Hol, Angew. Chem. 1986 25 767 and Verlinde C. L. M. J & Hol, W. G. J. Structure 1994, 2, 577, the respective contents of which are hereby incorporated by reference. See also Blundell et al., Nature 1987 326 347 (drug development based on information regarding receptor structure).

Thus, there are two preferred approaches to designing a molecule, according to the present invention, that complements the shape of the EGF receptor. By the geometric approach, the number of internal degrees of freedom (and the corresponding local minima in the molecular conformation space) is reduced by considering only the geometric (hard-sphere) interactions of two rigid bodies, where one body (the active site) contains “pockets” or “grooves” that form binding sites for the second body (the complementing molecule, as ligand). The second preferred approach entails an assessment of the interaction of respective chemical groups (“probes”) with the active site at sample positions within and around the site, resulting in an array of energy values from which three-dimensional contour surfaces at selected energy levels can be generated.

The geometric approach is illustrated by Kuntz et al., J. Mol. Biol. 1982 161 269, the contents of which are hereby incorporated by reference, whose algorithm for ligand design is implemented in a commercial software package distributed by the Regents of the University of California and further described in a document, provided by the distributor, which is entitled “Overview of the DOCK Package, Version 1.0,” the contents of which are hereby incorporated by reference. Pursuant to the Kuntz algorithm, the shape of the cavity represented by the EGF receptor site is defined as a series of overlapping spheres of different radii. One or more extant databases of crystallographic data, such as the Cambridge Structural Database System maintained by Cambridge University (University Chemical Laboratory, Lensfield Road, Cambridge CB2 1EW, U.K.) and the Protein Data Bank maintained by Brookhaven National Laboratory (Chemistry Dept. Upton, N.Y. 11973, U.S.A.), is then searched for molecules which approximate the shape thus defined.
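The idea of describing a cavity as overlapping spheres of different radii can be pictured with the simplified sketch below, which scores a candidate molecule by the fraction of its atoms lying inside at least one sphere. This is a deliberately crude stand-in and is not the matching procedure of the DOCK package; the sphere set, the candidate coordinates and the function names are hypothetical.

```python
import math

def inside_any_sphere(point, spheres):
    """True if the point lies within at least one (centre, radius) sphere."""
    return any(math.dist(point, centre) <= radius for centre, radius in spheres)

def fraction_inside(atoms, spheres):
    """Fraction of atom positions that fall inside the sphere-defined cavity."""
    if not atoms:
        return 0.0
    return sum(inside_any_sphere(a, spheres) for a in atoms) / len(atoms)

# Hypothetical cavity: two overlapping spheres of different radii (Angstroms).
cavity = [((0.0, 0.0, 0.0), 2.0), ((2.5, 0.0, 0.0), 1.5)]

# Hypothetical candidates: one elongated along the cavity, one mostly outside.
ligand_a = [(0.5, 0.0, 0.0), (2.0, 0.5, 0.0), (3.5, 0.0, 0.0)]
ligand_b = [(0.0, 0.0, 5.0), (6.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

score_a = fraction_inside(ligand_a, cavity)
score_b = fraction_inside(ligand_b, cavity)
print(score_a > score_b)
```

A candidate whose atoms largely fall within the spheres approximates the cavity shape better than one whose atoms lie outside, which is the sense in which a database can be searched for molecules approximating the shape.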

Molecules identified in this way, on the basis of geometric parameters, can then be modified to satisfy criteria associated with chemical complementarity, such as hydrogen bonding, ionic interactions and Van der Waals interactions.

The chemical-probe approach to ligand design is described, for example, by Goodford, J. Med. Chem. 1985 28 849, the contents of which are hereby incorporated by reference, and is implemented in several commercial software packages, such as GRID (product of Molecular Discovery Ltd., West Way House, Elms Parade, Oxford OX2 9LL, U.K.). Pursuant to this approach, the chemical prerequisites for a site-complementing molecule are identified at the outset, by probing the active site (as represented via the atomic coordinates shown in FIG. 6) with different chemical probes, e.g., water, a methyl group, an amine nitrogen, a carboxyl oxygen, and a hydroxyl. Favored sites for interaction between the active site and each probe are thus determined, and from the resulting three-dimensional pattern of such sites a putative complementary molecule can be generated.
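The probe approach can be sketched, under strong simplifying assumptions, as the evaluation of an interaction energy at each point of a regular grid around the site atoms. The example below uses only a 12-6 Lennard-Jones term with a single hypothetical parameter pair; commercial packages such as GRID use more elaborate energy functions, and the site atom, grid extent and function names here are illustrative only.

```python
import math

def lennard_jones(r, epsilon=0.2, sigma=3.4):
    """12-6 Lennard-Jones energy at separation r (hypothetical parameters)."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 * ratio6 - ratio6)

def probe_energy(position, site_atoms):
    """Sum of probe-atom energies; distances are floored to avoid r = 0."""
    total = 0.0
    for atom in site_atoms:
        r = max(math.dist(position, atom), 0.5)
        total += lennard_jones(r)
    return total

def energy_grid(site_atoms, lo, hi, step):
    """Map each grid point in the cube [lo, hi]^3 to its probe energy."""
    n = int(round((hi - lo) / step)) + 1
    axis = [lo + i * step for i in range(n)]
    return {
        (x, y, z): probe_energy((x, y, z), site_atoms)
        for x in axis for y in axis for z in axis
    }

# A single hypothetical site atom at the origin, probed on a coarse grid.
site = [(0.0, 0.0, 0.0)]
grid = energy_grid(site, -4.0, 4.0, 2.0)

# The most favoured grid point is the one with the lowest energy.
best_point = min(grid, key=grid.get)
print(best_point, round(grid[best_point], 3))
```

Contouring such an array at a selected energy level yields the three-dimensional surfaces of favoured probe positions referred to above.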

Programs suitable for searching three-dimensional databases to identify molecules bearing a desired pharmacophore include: MACCS-3D and ISIS/3D (Molecular Design Ltd., San Leandro, Calif.), ChemDBS-3D (Chemical Design Ltd., Oxford, U.K.), and Sybyl/3DB Unity (Tripos Associates, St Louis, Mo.).

Programs suitable for pharmacophore selection and design include: DISCO (Abbott Laboratories, Abbott Park, Ill.), Catalyst (Bio-CAD Corp., Mountain View, Calif.), and ChemDBS-3D (Chemical Design Ltd., Oxford, U.K.).

Databases of chemical structures are available from a number of sources including Cambridge Crystallographic Data Centre (Cambridge, U.K.) and Chemical Abstracts Service (Columbus, Ohio).

De novo design programs include Ludi (Biosym Technologies Inc., San Diego, Calif.), Sybyl (Tripos Associates) and Aladdin (Daylight Chemical Information Systems, Irvine, Calif.).

Those skilled in the art will recognize that the design of a mimetic may require slight structural alteration or adjustment of a chemical structure designed or identified using the methods of the invention.

The invention may be implemented in hardware or software, or a combination of both. However, preferably, the invention is implemented in computer programs executing on programmable computers each comprising a processor, a data storage system (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Program code is applied to input data to perform the functions described above and generate output information. The output information is applied to one or more output devices, in known fashion. The computer may be, for example, a personal computer, microcomputer, or workstation of conventional design.

Each program is preferably implemented in a high level procedural or object-oriented programming language to communicate with a computer system. However, the programs can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language.

Each such computer program is preferably stored on a storage medium or device (e.g., ROM or magnetic diskette) readable by a general or special purpose programmable computer, for configuring and operating the computer when the storage media or device is read by the computer to perform the procedures described herein. The inventive system may also be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein.

Compounds designed according to the methods of the present invention may be assessed by a number of in vitro and in vivo assays of hormone function. For example, the identification of EGF receptor antagonists may be undertaken using a solid-phase receptor binding assay. Potential antagonists may be screened for their ability to inhibit the binding of europium-labelled EGF receptor ligands to soluble, recombinant EGF receptor in a microplate-based format. Europium is a lanthanide fluorophore, the presence of which can be measured using time-resolved fluorometry. The sensitivity of this assay matches that achieved by radioisotopes, measurement is rapid and is performed in a microplate format to allow high sample throughput, and the approach is gaining wide acceptance as the method of choice in the development of screens for receptor agonists/antagonists (see Apell et al. J. Biomolec. Screening 3:19–27, 1998; Inglese et al. Biochemistry 37:2372–2377, 1998).

Binding affinity and inhibitor potency may be measured for candidate inhibitors using biosensor technology.

The EGF receptor antagonists may be tested for their ability to modulate receptor activity using a cell-based assay incorporating a stably transfected, EGF-responsive reporter gene (Souriau, C., Fort, P., Roux, P., Hartley, O., Lefranc, M-P., Weill, M., 1997, Nucleic Acids Res. 25:1585–1590). The assay addresses the ability of EGF to activate the reporter gene in the presence of novel ligands. It offers a rapid (results within 6–8 hours of hormone exposure), high-throughput (assay can be conducted in a 96-well format for automated counting) analysis using an extremely sensitive detection system (chemiluminescence). Once candidate compounds have been identified, their ability to antagonise signal transduction via the EGF-R can be assessed using a number of routine in vitro cellular assays such as inhibition of EGF-mediated cell proliferation. Ultimately, the efficacy of an antagonist as a tumour therapeutic may be tested in vivo in animals bearing tumour isografts and xenografts as described (Rockwell, P., O'Connor, W. J., King, K., Goldstein, N. I., Zhang, L. M., Stein, C. A., 1997, Proc Natl Acad Sci USA 94:6523–6528; Prewett, M., Rothman, M., Waksal, H., Feldman, M., Bander, N. H., Hicklin, D. J., 1998 Clin Cancer Res 4:2957–2966).

Tumour growth inhibition assays may be designed around a nude mouse xenograft model using a range of cell lines. The effects of the receptor antagonists and inhibitors may be tested on the growth of subcutaneous tumours.

Comparative Modelling

The comparative modelling method exploits the observation that proteins with more than 25% amino acid identity will almost always have a similar protein backbone (Sander, C. and Schneider, R., 1991, Proteins: Structure Function and Genetics, 9, 56–68). In some cases, proteins will have similar backbone structures with a lower proportion of identical amino acids. By aligning the sequence of a (target) protein which is to be modelled with the sequences of proteins with known structures (the templates), a model of the protein can be obtained. Where a region of the target sequence follows the sequence of a template, the backbone of the target is built to follow that of the template. Where the target sequence cannot be aligned to a template sequence, the so-called insertion must be constructed by other means (Greer, J., 1991, Meth. Enzym. pp 239–252).
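The identity threshold referred to above can be checked for a given pairwise alignment with a short calculation, sketched here in Python. The function name and the two aligned fragments are hypothetical, and the convention adopted (identical positions divided by the number of aligned, gap-free columns) is only one of several in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over the gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    identical = sum(a == b for a, b in pairs)
    return 100.0 * identical / len(pairs)

# Hypothetical aligned fragments of a target and a template.
target = "CKG-PCW"
template = "CRGAPCW"

identity = percent_identity(target, template)  # 5 identical of 6 columns
suitable_template = identity > 25.0
```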

The MODELLER program (Šali, A. and Blundell, T. L., 1993, J. Mol. Biol. 234, 779–815) is a semi-automated approach to building models of proteins given the structures of one or more templates and an alignment between the sequences of the target protein and the templates. Based on the sequence alignment and a set of rules derived from the analysis of sets of aligned structures, the program generates a series of restraints for variables such as Cα-Cα distances and main chain and side chain dihedral angles for the target structure. The restraints are expressed in terms of probability density functions (PDFs). The PDFs are combined to yield an expression for the most probable structure as a function of the variables (Cα-Cα distances etc). The program then attempts to find structures that maximise the value of this function. In effect, the program attempts to minimise a transformed version of this function.
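The relationship between maximising a combined probability and minimising a transformed function can be illustrated with the following Python sketch. It assumes, purely for illustration, that every restraint is a single Gaussian PDF and that the PDFs are combined as a product, so that the transformed function is the negative logarithm of that product; the actual functional forms used by MODELLER are not reproduced here, and all names and numbers are hypothetical.

```python
import math

def gaussian_pdf(x, mean, sigma):
    """Value of a normal probability density at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def objective(values, restraints):
    """Negative log of the product of restraint PDFs (lower is more probable)."""
    return -sum(math.log(gaussian_pdf(v, mean, sigma))
                for v, (mean, sigma) in zip(values, restraints))

# Hypothetical restraints on two Cα-Cα distances: (mean in Å, sigma in Å).
restraints = [(3.8, 0.1), (5.5, 0.3)]

at_means = objective([3.8, 5.5], restraints)   # most probable geometry
displaced = objective([4.0, 5.5], restraints)  # first distance off by 0.2 Å
penalty = displaced - at_means                 # 0.5 * (0.2 / 0.1) ** 2
```

Under these assumptions a displacement of two standard deviations in one restrained distance raises the objective by 2.0, which an optimiser would seek to remove.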

While some comparative modelling approaches involve the explicit building of regions of the model for which there is no sequence alignment with a template, the MODELLER program constructs PDFs for these regions, thus including them in the construction of the comparative model. It is conceivable that, once a comparative model has been constructed using MODELLER, an algorithm to build the structures of these regions could then be applied.

The MODELLER program was used to build the structures of the extracellular portion of the EGF receptor using the 3D structure of the IGF-1 receptor (as described in PCT/AU98/00998) as a template. The generation of these models is outlined below.

Construction of the Alignment

The region of the IGF-1 receptor whose structure is known (Garrett, T. P., McKern, N. M., Lou, M., Frenkel, M. J., Bentley, J. D., Lovrecz, G. O., Elleman, T. C., Cosgrove, L. J., Ward, C. W., 1998 Nature 394:395–399) consists of three domains, the L1 domain, cysteine-rich domain (CRD) and the L2 domain (in order of increasing residue number). The L1 and L2 domains adopt similar folds, each consisting of a single-stranded right-handed β-helix. The helix contains three β-sheets which make up the left and right sides and the bottom of the β-helix. The top is less regular. This type of β-helix has been dubbed a “breadloaf”. The cysteine-rich domain (CRD) consists of eight small modules, each of which has one or two disulfide bonds. The first three modules of the CRD contain two disulfide bonds which have a Cys1–Cys3 and Cys2–Cys4 disulfide pairing arrangement. The next four have a single disulfide bond with a so-called β-finger structure. The eighth module of the CRD contains one disulfide bond but is not a β-finger.

The sequence of the EGF receptor extracellular domain can be divided into four domains, L1, S1, L2 and S2 (in order of increasing residue number) on the basis of internal homology and homology with the insulin receptor family (Ward, C. W., Hoyne, P. A., Flegg, R. H., 1995, Proteins 22:141–153; Bajaj, M., Waterfield, M. D., Schlessinger, J., Taylor, W. R., Blundell, T., 1987, Biochim Biophys Acta 916:220–226). The L1 and L2 domains are similar in sequence to each other and to the L1 and L2 domains in the IGF-1 receptor. The S1 and S2 domains are similar in sequence and also similar to the CRD of the IGF-1 receptor. These three domains contain a large number of cysteine residues, although the S2 domain of the EGF receptor has two fewer cysteine residues than do the CRD of the IGF-1 receptor and the S1 domain of the EGF receptor.

Two important sequence motifs are found in the EGF receptor sequence which are conserved in other EGF receptor homologues. The first motif is the sequence CXXXXXXW which is found near the end of the sequences of the L1 and L2 domains of the EGF receptor and its homologues where C is cysteine and W is tryptophan. (The motif in the L1 domain of the EGF receptor consists of C133–W140 and in the L2 domain consists of C446–W453.) The second motif is the sequence CW which occurs near the start of the S1 and S2 domains of the EGF receptor (C175–W176 in the S1 domain and C491–W492 in the S2 domain). The two motifs also occur in the insulin receptor family (C120XXXXXXW127 and C175W176 in IGF-1 receptor) in the L1 domain and cysteine-rich domain respectively. In contrast to the EGF receptor and its homologues the first of these two motifs does not occur in the L2 domain of the insulin receptor family. Structurally, the first motif corresponds to part of the L1 domain which allows penetration of the tryptophan residue of the second motif into the β-helix. As the first sequence motif is absent from the L2 domain of the IGF-1 receptor, very little of the structure of this domain was used as a template in the modelling of the EGF receptor.
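The two motifs lend themselves to a simple pattern search, sketched below in Python using regular expressions. The helper name and the toy sequence are hypothetical; the toy sequence is not a fragment of any receptor and merely places a CXXXXXXW motif and a CW motif at known positions.

```python
import re

# CXXXXXXW: cysteine, any six residues, tryptophan (as in C133–W140).
L_DOMAIN_MOTIF = re.compile(r"C.{6}W")
# CW: cysteine immediately followed by tryptophan (as in C175–W176).
S_DOMAIN_MOTIF = re.compile(r"CW")

def find_motif(pattern, sequence):
    """Return 1-based (first, last) residue numbers of non-overlapping matches."""
    return [(m.start() + 1, m.end()) for m in pattern.finditer(sequence)]

# Hypothetical toy sequence in one-letter code.
toy = "AAACDEFGHIWAAACWAA"

first_motif = find_motif(L_DOMAIN_MOTIF, toy)   # [(4, 11)]
second_motif = find_motif(S_DOMAIN_MOTIF, toy)  # [(15, 16)]
```

As in the receptor numbering quoted above, the cysteine and tryptophan of the first motif are seven residues apart, and those of the second motif are adjacent.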

Construction of the Alignment of L1 and S1

As the L1 domain of the IGF-1 receptor has a defined core, the sequence alignment was manually constructed with a view to placing most of the conserved hydrophobic residues of the EGF receptor such that their side chains point towards the β-helical core. Homologues of the EGF receptor were included in the alignment to assist with the identification of such residues (FIG. 1). Other IGF-1 receptor residues whose positions were conserved were the four cysteine residues in the L1 domain and the residues Arg 77, Trp 127, Trp 176 and Gln 182. Two small regions of the IGF-1 receptor were also included in the alignment. The first of these regions includes the sequence Ser 375–Lys 380 from the L2 domain of the IGF-1 receptor and is used as a template for modelling the EGF receptor residues Asp 51–Lys 56. Additional flanking residues were also used. Residues Ile 385–Phe 397 of the IGF-1 receptor were also used as a template to better model the EGF receptor residues Ile 83–Leu 95 (FIG. 1).

The alignment of the S1 domain of the EGF receptor to the cysteine-rich domain of the IGF-1 receptor used the same combination of modules. All of the putative modules of the EGF receptor S1 domain were aligned to part or all of the corresponding module of the CRD of the IGF-1 receptor. The third module of the IGF-1 receptor CRD (Cys 201–Cys 218) was used as an additional template to the first (Cys 166–Cys 183) and second (Cys 191–Cys 207) putative modules of the EGF receptor S1 domain. The residues Cys 230–Cys 246 of IGF-1 receptor, which include the protein's fifth module, were aligned to the EGF receptor residues Cys 267–Cys 283 (which include the EGF receptor S1 domain's putative sixth module).

Construction of the Alignment of L2 and S2

Construction of the alignment of the sequence of the L2 domain of the EGF receptor to the sequence of the L1 domain of the IGF-1 receptor followed similar principles to that of the alignment of the L1 domain of the EGF receptor. The region Ile 385–Phe 397 of the IGF-1 receptor served as an additional template and its sequence was aligned to Ile 402–Leu 414 of the EGF receptor (FIG. 2).

An analysis of β-finger modules in the IGF-1 receptor, TNF receptor and the laminin-γ structures revealed that these modules could be classified into three types exhibiting some structural and sequence conservation. Two of the structural types are relevant to the IGF-1 receptor and the EGF receptor. The first type of β-finger is characterised by structural conservation of the C-terminal part of the module and also of the linker region after the module. The signature sequence is C . . . CXXC where the third cysteine residue is the start of another β-finger module. The second type of β-finger is characterised by structural conservation of the N-terminal portion of the module and also of the linker region after the module. The signature sequence is C . . . CXXXC where the third cysteine is the start of a module whose disulfide bonding pattern has a Cys 1–Cys 3, Cys 2–Cys 4 arrangement.

Comparison of the sequences of the modules of the IGF-1 receptor CRD with the sequence of the EGF receptor S2 domain suggested that the arrangement of modules in the S2 domain was different from those of the IGF-1 receptor CRD and the EGF receptor S1 domain. The residues of the third module in the CRD of the IGF-1 receptor, Cys 201–Cys 218, could be aligned with the segments of the EGF receptor S2 domain sequence: Cys 482–Cys 499; Cys 534–Cys 555 and Cys 596–Cys 612. These modules are the putative first, fourth and seventh modules of the S2 domain. The residues of the first EGF receptor module were also aligned to residues Cys 152–Cys 181 of the first module of the IGF-1 receptor CRD. The residues of the fourth module in the CRD of the IGF-1 receptor, Cys 221–Cys 230, a β-finger module of the first type described above, could be aligned with the regions of sequence Cys 502–Cys 511 and Cys 558–Cys 567. These two regions of the EGF receptor S2 domain are the putative second and fifth modules. By elimination, the regions between the two sets of remaining cysteine residues (the putative third and sixth modules) were assigned as β-finger modules of the second type. These regions of sequence are followed by three residues and then a module containing four cysteine residues. The N-terminal regions of the fifth (Cys 234–Cys 246) and seventh modules (Cys 277–Cys 291) of the IGF-1 receptor CRD were both aligned to the N-terminal regions of the two modules (Cys 515–Cys 531 and Cys 571–Cys 593).

In the IGF-1 receptor CRD, there is no occurrence of a β-finger module being followed by a module containing four cysteine residues. Thus, the positioning of the fourth module in the EGF receptor S2 model relative to the third module is essentially arbitrary. The same applies to the positioning of the seventh module relative to the sixth module of the EGF receptor S2 domain model.

Construction of the Model

Version 3 of the MODELLER program (Modeler User Guide, October 1996, San Diego Molecular Simulations Inc) was used to build models of the EGF receptor. The various sequences of the IGF-1 receptor and the EGF receptor shown in FIG. 1 were used as the alignment for the construction of the model of the L1 and S1 domains of the EGF receptor. The coordinates of each of the IGF-1 receptor sequences (i.e. the templates) shown in FIG. 1 were also used as input for the MODELLER program. Additional distance restraints were generated between Cα atoms of selected residues. The restraints were generated as follows. The small IGF-1 receptor templates were superimposed into the structure of the first two domains of the IGF-1 receptor using the Cα atoms of the residues which are aligned in FIG. 1. Using the Homology module of the Insight II program (Homology User Guide, October 1995, San Diego BIOSYM/MSI) coordinates were built for the EGF receptor residues which are aligned to the IGF-1 receptor coordinates which are in bold typeface. From these coordinates, distance restraints in the form of Gaussian curves were constructed for pairs of Cα atoms with a distance less than 50 Å. The sigma value of the Gaussian curves was set to be 2 Å. A MODELLER run was submitted using the alignment in FIG. 1. The built models of proteins attempt to satisfy these restraints in addition to the restraints the program derives from the alignment.
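The generation of the additional Cα-Cα restraints may be pictured with the following Python sketch, which lists every pair of Cα atoms closer than the 50 Å cut-off and attaches a Gaussian restraint centred on the observed distance with a sigma of 2 Å. The data layout, the function name and the three coordinates are hypothetical; the sketch does not write restraints in any MODELLER file format.

```python
import math
from itertools import combinations

def ca_distance_restraints(ca_coords, cutoff=50.0, sigma=2.0):
    """Build (res_i, res_j, mean_distance, sigma) for Cα pairs under cutoff.

    ca_coords maps residue number -> (x, y, z) of the Cα atom, in Å.
    """
    restraints = []
    for (i, a), (j, b) in combinations(sorted(ca_coords.items()), 2):
        distance = math.dist(a, b)
        if distance < cutoff:
            restraints.append((i, j, distance, sigma))
    return restraints

# Hypothetical Cα coordinates; residue 3 is more than 50 Å from the others.
coords = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (60.0, 0.0, 0.0)}

restraints = ca_distance_restraints(coords)
```

With these coordinates only the pair of residues 1 and 2 yields a restraint; the sigma argument could be set to 1.0 to mirror a tighter restraint set.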

The aligned IGF-1 receptor and EGF receptor sequences of FIG. 2 were used as the alignment for creating the model of the L2 and S2 domains of the EGF receptor. The coordinates of each of the IGF-1 receptor sequences shown in FIG. 2 were used as the structural templates. Two separate sets of additional restraints were used. The first set was based on the underlined IGF-1 receptor residues which are aligned to EGF receptor residues Cys 482–Cys 534 (the first module of the S2 domain to the first cysteine of the fourth module). From the coordinates of the Cα atoms of these residues, distance restraints in the form of Gaussian curves were constructed for pairs of Cα atoms with a distance less than 50 Å. The second set of additional restraints was based on the Cα atoms of the underlined IGF-1 receptor residues which are aligned to EGF receptor residues Cys 534–Cys 596 (the fourth module of the S2 domain to the first cysteine of the seventh module). The sigma value of the Gaussian curves used to construct the additional restraints was 1 Å.

For both sets of models, the MODELLER program constructed 20 models whose coordinates were perturbed from an initial structure by a random value of maximum distance 4 Å. The refinement level used was the ‘refine1’ option in the MODELLER program.

Most of the insertion regions of the EGF receptor models were constructed using the “loop” routine of version 4 of MODELLER (Modeler User Guide, June 1997, San Diego Molecular Simulations Inc). Coordinates for each insertion were built using one of the two models obtained in the previous section as a scaffold. The regions of sequence for which coordinates were built in this manner were 1–5, 8–12, 16–23, 46–51, 101–107, 145–148, 184–191, 241–262, 319–328, 522–530, 540–546, 578–600 and 612–621. Coordinates for residues 351–368 and 387–393 were constructed simultaneously due to the proximity of these regions in the model of the L2 domain. For each insertion, 50 models were constructed. In cases where the generated loops with the lowest scores had similar backbone structures, the loop building process was considered to have converged and the coordinates of the loop replaced those of the same residues on the refined model. Where the loop structures did not converge, the structures with the three lowest MODELLER loop scores were evaluated using Procheck (Laskowski R A, MacArthur M W, Moss D S, Thornton J M. (1993). J Appl. Crystallogr 26: 283–291), ProsaII (Hendlich M, Lackner P, Weitckus S, Floeckner H. Froschauer R, Gottsbacher K, Casari G, Sippl M J. (1990) J Mol Biol 216:167–180; Sippl M J. (1993) Proteins 17: 355–362.) and Profiles-3D (Bowie J U, Lüthy R, Eisenberg D. (1991) Science 253:164–170; Lüthy R, Bowie J U, Eisenberg D. 1992. Nature 356:83–85.). For several of these loops, the one with the second lowest MODELLER score was selected as it had a more favorable Profiles-3D and ProsaII plot.
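A convergence test of the kind described may be sketched in Python as follows. The criterion used here is an assumption, since no numerical criterion is stated above: the lowest-scoring loops (three by default) are deemed similar when every pair has a backbone root-mean-square deviation of at most 1.0 Å, computed without superposition because all loops are built on the same scaffold. All names and coordinates are hypothetical.

```python
import math
from itertools import combinations

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length coordinate lists."""
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

def has_converged(loops, top_n=3, max_rmsd=1.0):
    """loops is a list of (score, coords); compare the top_n lowest scores."""
    best = sorted(loops, key=lambda item: item[0])[:top_n]
    return all(rmsd(x[1], y[1]) <= max_rmsd
               for x, y in combinations(best, 2))

# Hypothetical two-atom loop models with their scores.
loops = [
    (10.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (10.5, [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)]),
    (11.0, [(0.0, 0.2, 0.0), (1.0, 0.2, 0.0)]),
    (30.0, [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0)]),
]

converged = has_converged(loops)                   # three best agree
converged_with_outlier = has_converged(loops, top_n=4)
```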

In order to retain certain secondary structures, additional restraints were used in the construction of some of the loops. Restraints with the form of a right-handed half-Gaussian function with a sigma value of 0.05 Å were used to hold selected mainchain N-O distances to 3.0 Å or less. The atom pairs for which this additional restraint was added were: Gln 139.N–Gln 184.OE1, Val 268.N–Tyr 261.O, Val 268.O–Tyr 261.N, Ser 506.N–Ser 529, Ile 562.N–His 591.O and Glu 578.N–Val 592.
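One reading of such an upper-bound restraint is sketched below in Python. It assumes that “right-handed” means the Gaussian term acts only on distances greater than the 3.0 Å bound, and it expresses the restraint as the corresponding quadratic penalty (the negative logarithm of the Gaussian, up to a constant). The function name is hypothetical and the sketch is not MODELLER's internal implementation.

```python
def upper_bound_penalty(distance, bound=3.0, sigma=0.05):
    """Quadratic penalty applied only when distance exceeds bound (in Å)."""
    if distance <= bound:
        return 0.0  # restraint satisfied: no penalty below the bound
    z = (distance - bound) / sigma
    return 0.5 * z * z

satisfied = upper_bound_penalty(2.8)  # within 3.0 Å
violated = upper_bound_penalty(3.1)   # 0.1 Å over, i.e. two sigma
```

With a sigma of 0.05 Å the penalty rises steeply, so that an N-O distance only 0.1 Å over the bound already costs 2.0 in these units.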

Structure of the EGF Receptor Model

The structure of the L1 and S1 domains of the EGF receptor as determined by the modelling described above is shown in FIG. 3, while the structure of the L2 and S2 domains is shown in FIG. 4. The superposition of these two models onto the structure of the extracellular domains of the IGF-1 receptor is shown in FIG. 5.

The coordinates of the EGF receptor domains L1, S1, L2 and S2 are shown in FIG. 6.

FIGS. 7, 8 and 9 show the sidechains of residues of the EGF receptor models which face the large cavity as shown in FIG. 5. FIG. 10 shows the sidechains of residues of the face of the EGF receptor L2 domain which contains the second beta sheet (the lower face of the L2 domain using the orientation shown in FIG. 4).

The structures of the L1 and S1 domains are similar to those of the IGF-1 receptor structure, as expected. There are three major differences in the S1 domain of the EGF receptor model from the structure of the IGF-1 receptor cysteine-rich domain. The first module of the S1 domain is noticeably smaller than that of the IGF-1 receptor CRD. The sixth module (Cys 271–Cys 283) of the S1 domain is smaller than that of the IGF-1 receptor and occupies less of the region between the L1 and L2 domains. The fifth module (Cys 240–Cys 267) contains a large insertion which points away from the L1 domain. The eighth module of the EGF receptor S1 domain (Cys 305–Cys 309) and the linker region (Arg 310–Val 312) which follows it are similar in structure to the analogous regions of the IGF-1 receptor. As in the IGF-1 receptor, the linker region is postulated to be a hinge region about which the S1 domain and the L2 domain can reorient.

A region of the EGF receptor in the L2 domain which could not be aligned with the IGF-1 receptor includes the residues Trp 386–Pro 387 which are conserved across the EGF receptor family. This sequence motif is not found in the insulin receptor family and may represent a region of novel structure.

The amino acids 352–367 correspond to a large insertion in the L2 domain of the EGF receptor. The amino acids 351–364 have been identified as the epitope for several antibodies against the EGF receptor (Wu, D. G., Wang, L. H., Sato, G. H., West, K. A., Harris, W. R., Crabb, J. W., Sato, J. D., 1989, J. Biol. Chem. 264:17469–17475). This region forms a loop which sticks out of the surface, which is consistent with its being accessible to antibodies. The structure itself is difficult to model accurately since its sequence does not correspond to any part of the IGF-1 receptor sequence. The position of this insertion is in approximately the same region as where the IGF-1 receptor differs in backbone structure.

The S2 domain model of the EGF receptor adopts a different arrangement of modules and consequently a different shape from that of the CRD of the IGF-1 receptor and the S1 domain model of the EGF receptor. The disulfide bond arrangement is the same as that predicted by similarity to the tumour necrosis factor receptor (Ward, C. W., Hoyne, P. A., Flegg, R. H., 1995, Proteins 22:141–153) and has since been confirmed by mass spectroscopic analyses of proteolytically digested EGF receptor extracellular domain (Abe, Y., Odaka, M., Inagaki, F., Lax, I., Schlessinger, J., Kohda, D., 1998, J. Biol. Chem. 273:11150–11157). The only significant contact of the S2 domain with the L2 domain of the EGF receptor model is the intercalation of Trp 492 into the L2 domain, analogous to that made by Trp 176 in the S1 domain of the EGF receptor and Trp 176 in the CRD of the IGF-1 receptor to their respective L1 domains. Unlike the S1 domain of the EGF receptor, the rest of the S2 domain does not make any contacts with the L2 domain. The S2 domain is rod-like and points out from the L2 domain with a different geometry to the manner in which the S1 domain points out from the L1 domain.

Putative Binding Sites of the EGF Receptor

From the IGF-1 receptor structure and a number of insulin receptor mutants, one of the regions of insulin binding was proposed to be the face of the L1 domain which contains the second β-sheet (Garrett, T. P., McKern, N. M., Lou, M., Frenkel, M. J., Bentley, J. D., Lovrecz, G. O., Elleman, T. C., Cosgrove, L. J., Ward, C. W., 1998 Nature 394:395–399). This surface is characterised by a number of hydrophobic residues which point out of the structure and also the presence of a structurally conserved loop. By analogy, we propose that the analogous β sheets of the L1 and L2 domains are potential binding sites. These sheets contain a number of hydrophobic residues, conserved amongst EGF receptor family members, which point away from the core of the β-helix structure. Residue 45 of a mutant EGF has been cross-linked to the residue Lysine 465 which is in the last strand of the lower β sheet of the L2 domain (Summerfield, A E et al, J Biol Chem, 1996, 271(33), 19656–19659). Tyrosine 101 has been cross-linked to the N-terminus of EGF (Woltjer, R L et al, PNAS, 1992, 89(16), 7801–7805). This residue is in the portion of sequence which immediately follows a strand in the lower β sheet of L1.

The side chain of asparagine 1 of EGF has been cross-linked to lysine 336 of the EGF receptor (Wu, D G et al, PNAS, 1990, 87(8), 3151–3155). The latter residue is in the N-terminal helix of the L2 domain and points towards the cavity which is formed when the two halves of the EGF receptor are positioned in a similar arrangement to the first three domains of the IGF-1 receptor. Two nearby residues, Asn 328 and Asn 337, are glycosylated. This mutation is in a similar position to the insulin receptor mutant S323L which has aberrant insulin binding.

Several insertional mutants of the EGF receptor extracellular domain were constructed to probe the role of several regions of the receptor (Harte, M. T., Gentry, L. E., 1995, Arch Biochem Biophys 322:378–389). A number of these mutants were not detectably secreted by the cells producing them, suggesting that they did not fold to form stable proteins. Most of these insertions were in positions where, according to the model, the structure would be unable to tolerate an insertion. In contrast, most of the other insertions were in loops or other positions which, according to the model, are able to tolerate insertions. EGF receptor extracellular domain mutants with insertions at residues 162, 169, 174 and 220 bound EGF with a similar affinity to the wild-type EGF receptor extracellular domain but bound TGF-α with a lower affinity. The first of these insertions was located one residue before the last cysteine residue of the L1 domain. The second and third insertions were present in the first module of the EGF receptor S1 domain and the fourth was present in the third module of the S1 domain. All of these positions are on a side of the molecule far removed from the large cavity as shown in FIG. 5. EGF receptor mutants with insertions at positions 251 (in the fifth module of the S1 domain) and 575 (in the sixth module of the S2 domain) appeared to bind twice as much ligand as the wild-type receptor. Two insertional mutants which showed reduced EGF receptor binding contained insertions at positions 291 (in the seventh module of the S1 domain) and 474 (one residue before the last cysteine of the L2 domain).

Another EGF receptor mutant which shows altered ligand binding behaviour is the R497K mutant. The site of this mutation is in the first module of the S2 domain and faces the side of the L2 domain opposite to that containing residue 465. This mutant binds EGF in a similar fashion to the wild-type receptor but abolishes the high affinity binding site for TGF-α (Moriai, T., Kobrin, M. S., Hope, C., Speck, L., Korc, M., 1994, Proc Natl Acad Sci USA 91:10217–10221).

On the faces containing the second β-sheet (the lower face according to the orientations shown in FIGS. 3 and 4) of the L1 and L2 domains are a number of solvent-exposed hydrophobic residues including Tyr 64, Leu 66, Tyr 89, Tyr 93, Leu 348, Phe 380 and Phe 412. According to a survey of protein-protein interfaces, tyrosine, phenylalanine and leucine are more likely to be involved in an interface than on the exterior of a protein complex (Tsai C-J, Lin S L, Wolfson, H J, Nussinov R (1997) Protein Sci 6: 53–64). Lys 465 is located on the lower face of the L2 domain and Tyr 101 is proximal to the lower face of the L1 domain; both locations are consistent with the lower faces of the domains having roles in ligand binding.

Strategies for Developing EGF Receptor Ligands

For several signalling systems, ligand analogues which have antagonist properties have been described. These ligands include the human growth hormone (Chen W Y, Chen N Y, Yun J, Wagner T E, Kopchick J J (1994) J Biol Chem 269:15892–15897), interleukin-6 (Savino R, Lahm A, Salvati A L, Ciapponi L, Sporeno E, Altamura S, Paonessa G, Toniatti C, Ciliberto G EMBO J. 1994 Mar. 15;13(6):1357–67) and interleukin-4 (Kruse N, Tony H P, Sebald W (1992) EMBO J 11:3237–3244; Zurawski S M, Vega F Jr, Huyghe B, Zurawski G (1993) EMBO J 12:2663–2670). The function of the unmodified ligands is to bind their receptors and then subsequently recruit a second receptor molecule. The mutations of the ligands mentioned above are in positions which interfere with the binding of the second receptor (de Vos A M, Ultsch M, Kossiakoff A A (1992) Science 255:306–312; Brakenhoff J P, de Hon F D, Fontaine V, ten Boekel E, Schooltink H, Rose-John S, Heinrich P C, Content J, Aarden L A (1994) J Biol Chem 269:86–93; Davis I D, Treutlein H R, Friedrich K, Burgess A W (1995) Growth Factors 12:69–83).

To date, no analogues of EGF receptor ligands have been found which are purely antagonistic. Whether EGF and its homologues have sites of binding for two receptor molecules, like the proteins described above, has not been shown. Analysis of 1H NMR transferred nuclear Overhauser enhancement data for titration of TGF-α with the extracellular domain of the EGF receptor indicates that most parts of the ligand are in contact with the receptor upon binding (McInnes C, Hoyt D W, Harkins R N, Pagila R N, Debanne M T, O'Connor-McCourt M, Sykes B D (1996) J Biol Chem 271:32204–32211). However, the concentrations used in the experiment were such that the dominant receptor species was the ligand-receptor complex with 2:2 stoichiometry. Moreover, even if the ligands of the EGF receptor are buried in the cleft formed by the first three domains of the receptor, it is difficult to envisage that such binding will lead to contact with most of the bound ligand when only one receptor binds the ligand. In an alternative scheme, at least two separate faces on EGF are required to bind into the large cleft of a single EGF receptor molecule; this binding enacts a conformational change in the receptor which then allows it to dimerise. An antagonist may bind to the first binding site of the receptor and not the second, thus preventing dimerisation and subsequent signalling of the receptor. Thus, delineation of the parts of the ligand involved in the (putative) primary and secondary binding faces would greatly assist antagonist design.

Using the EGF receptor model and the known structures of EGF receptorligands, it may be possible to construct a model, or a partial model, ofligand binding which could suggest which parts of bound ligand areinvolved in binding to the first and second EGF receptors of theligand-receptor complex. There are several computer programs that canassist with the construction of such models. Programs such as Quilt(Lijnzaad P, Argos P (1997) Proteins 28:333–343; Lijnzaad P, Berendsen HJ, Argos P (1996) Proteins 26:192–203; Lijnzaad P, Berendsen H J, ArgosP 1996 Proteins 25:389–397) can be used to suggest sites on proteinsinvolved in interactions with other proteins. Possible structures ofprotein complexes can be obtained by programs such as FT-DOCK (Gabb H A,Jackson R M, Sternberg M J (1997) J Mol Biol 272:106–120) and GRAM(Vakser I A (1996) Biopolymers 39:455–464; Katchalski-Katzir E, SharivI, Eisenstein M, Friesem A A, Aflalo C, Vakser I A (1992) Proc Natl AcadSci USA 89:2195–2199). The calculation of electrostatic potentials fromthe Poisson-Boltzmann equation has been used to investigate complexesmade up of cytokines and growth factors and their receptors (Demchuk E,Mueller T, Oschkinat H, Sebald W, Wade R C (1994) Protein Sci 3:920–935)and may guide the construction of model complexes. The construction ofmodels will suggest regions of the EGF receptor ligands which may beinvolved in receptor binding. With the model and supporting experiments,it is envisaged that mutants of EGF and TGF-α will be constructed whichare potential antagonists.

The majority of targets for drugs which have made use of structural information are enzymes. One advantage of enzymes over other types of proteins is the presence of substrate-binding clefts whose normal function is to bind small molecule substrates or short lengths of peptides. In contrast, few small molecule inhibitors have been developed which inhibit protein-protein interactions.

Desolvation of protein surfaces appears to be an important factor in the formation of a protein-protein complex. Since, unlike the substrate-binding clefts of enzymes, protein-binding surfaces tend to be much less concave, a bound small molecule is unlikely to provide enough desolvation to enable tight binding. The lower surfaces of the L1 and L2 domains, which have been suggested to be involved in ligand binding, contain hydrophobic regions which suggest that they need to be buried for strong binding of a molecule to these surfaces to occur. We envisage that cyclic molecules, including cyclic peptides, may be able to bind to such surfaces. Hydrophobic functional groups may be chosen which, when bound to the hydrophobic regions of the relevant face, desolvate regions of the protein. Some of the functional groups which interact with the protein will be polar or charged to make favourable electrostatic interactions. Other parts in the cyclic molecule may be polar or charged to increase the aqueous solubility of the molecule. Cyclic molecules also have the advantage of having few possible conformations when unbound, providing a lower loss of entropy upon binding and thus tighter binding as compared to a non-cyclic analogue. A degree of flexibility would exist and would allow the molecule to alter its conformation to better accommodate the protein it is binding to.
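The entropic argument may be made concrete using the standard relation between binding free energy and dissociation constant, ΔG° = RT ln K_d. The Python sketch below is illustrative only: the temperature, the number of pre-organised bonds and the entropic cost per bond are assumed values chosen for the example, not measurements on any EGF receptor ligand.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 298.15   # assumed temperature (25 degrees C)


def delta_g_from_kd(kd_molar, temperature=T_KELVIN):
    """Standard binding free energy (kcal/mol) from K_d in mol/L (1 M standard state)."""
    return R_KCAL * temperature * math.log(kd_molar)


def kd_from_delta_g(delta_g_kcal, temperature=T_KELVIN):
    """Dissociation constant (mol/L) from a standard binding free energy (kcal/mol)."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature))


def kd_fold_improvement(penalty_saved_kcal, temperature=T_KELVIN):
    """Factor by which K_d falls when the binding penalty shrinks by the given amount."""
    return math.exp(penalty_saved_kcal / (R_KCAL * temperature))


# Hypothetical numbers: cyclisation pre-organises 3 rotatable bonds, each
# assumed to cost 0.6 kcal/mol of conformational entropy when frozen on binding.
BONDS_PREORGANISED = 3
COST_PER_BOND_KCAL = 0.6

if __name__ == "__main__":
    saved = BONDS_PREORGANISED * COST_PER_BOND_KCAL
    print(f"Penalty avoided: {saved:.1f} kcal/mol")
    print(f"K_d improvement: {kd_fold_improvement(saved):.0f}-fold")
    print(f"dG at K_d = 1e-6 M: {delta_g_from_kd(1e-6):.2f} kcal/mol")
```

Under these assumed values the cyclic analogue would bind roughly 20-fold more tightly than the non-cyclic one, all else being equal. The figure depends entirely on the assumed cost per bond.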

Algorithms such as LUDI (Bohm H J (1992) J Comput Aided Mol Des 6:593–606) can be used to search for functional groups and molecular moieties which may interact with a surface of the EGF receptor model. Algorithms such as CLIX (Lawrence M C, Davis P C (1992) Proteins 12:31–41) or DOCK (Kuntz I D, Blaney J M, Oatley S J, Langridge R, Ferrin T E (1982) J Mol Biol 161:269–88) can be used to search a database of molecular structures for those which have shape and/or chemical complementarity to the EGF receptor. Computational combinatorial design algorithms (Miranker A, Karplus M (1991) Proteins 11:29–34; Eisen M B, Wiley D C, Karplus M, Hubbard R E; Caflisch A (1994) Proteins 19:199–221) can also be tried. In one instance, a combinatorial approach has been used to design peptides to inhibit the interaction of the proteins Ras and Raf (Zeng, J, et al, Protein Engineering, to be published).
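A database search for shape complementarity may be sketched, in much simplified form, as a distance-matching filter. The Python example below is loosely inspired by the idea of matching inter-atomic distances in a candidate molecule to distances between points that describe a binding site. It is not the DOCK or CLIX algorithm, and the site points, molecule names, coordinates and tolerance are all hypothetical.

```python
import math
from itertools import permutations


def pairwise_distances(points):
    """Distances for the pairs (0,1), (0,2) and (1,2) of three 3-D points."""
    return [math.dist(points[i], points[j]) for i, j in ((0, 1), (0, 2), (1, 2))]


def matches_site(atoms, site_points, tolerance=0.5):
    """True if some ordered triple of atoms reproduces the site-point distances."""
    target = pairwise_distances(site_points)
    for triple in permutations(atoms, 3):
        found = pairwise_distances(triple)
        if all(abs(f - t) <= tolerance for f, t in zip(found, target)):
            return True
    return False


def screen(database, site_points, tolerance=0.5):
    """Names of database entries containing a distance-compatible atom triple."""
    return [name for name, atoms in database.items()
            if matches_site(atoms, site_points, tolerance)]


# Hypothetical site description: three points (in angstroms) on a receptor surface.
SITE = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 3.0, 0.0)]

# Hypothetical database of candidate molecules (atom coordinates in angstroms).
DATABASE = {
    "mol_a": [(1.0, 1.0, 1.0), (5.0, 1.0, 1.0), (1.0, 4.0, 1.0), (9.0, 9.0, 9.0)],
    "mol_b": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    "mol_c": [(0.0, 0.0, 0.0), (0.0, 0.0, 4.2), (3.1, 0.0, 0.0)],
}

if __name__ == "__main__":
    print(screen(DATABASE, SITE))
```

A filter of this kind only shortlists candidates. Each hit would then need to be oriented in the site and scored for steric and chemical fit before being considered further.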

We envisage that as an alternative to a cyclic molecule, a small protein could be used as a scaffold for placing amino acids that will interact with the EGF receptor. At least one small protein (potato carboxypeptidase inhibitor) with a fold different to that of EGF receptor ligands has been identified which is a weak EGF antagonist (Blanco-Aparicio C, Molina M A, Fernandez-Salas E, Frazier M L, Mas J M, Querol E, Aviles F X, de Llorens R (1998) J Biol Chem 273:12370–12377). The use of a structural scaffold for proteins with diverse functions has been observed in Nature (Lin S L, Nussinov R (1995) Nat Struct Biol 2:835–837). Other molecular scaffolds such as dendrimers may also be considered which can be used to present the functional groups which will tightly interact with the EGF receptor.

At least two non-exclusive modes of action can be envisaged. The first mode involves a molecule competing for binding sites with one of the EGF receptor's natural ligands. Most likely, the molecule will prevent the receptor dimerisation which is required for activation of the EGF receptor, thus acting as an antagonist. We do not rule out the possibility that the binding may be activating and the molecule acts as an agonist. The second potential mode of action is for the molecule to bind to a site on the EGF receptor which is not necessarily a ligand binding site. Such a molecule may be physically large enough to hinder physical access of a second receptor to the receptor which binds the molecule in question. This would hinder dimerisation and subsequent activation of the receptor. If the molecule is sufficiently “sticky”, it may attract a second EGF receptor and induce dimerisation, thereby acting as an agonist rather than an antagonist.

It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the invention as shown in the specific embodiments without departing from the spirit or scope of the invention as broadly described. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.

1. A method of identifying a compound which modulates binding of a ligand to an EGF receptor comprising: (A) designing or screening for a compound which binds to the structure formed by amino acids 1–475 or formed by amino acids 313–621 of a receptor having the atomic coordinates as shown in FIG. 6 for amino acids 1–621 of the EGF receptor, where binding of the compound to the structure is favored energetically, and (B) testing the compound designed or screened for in (A) for its ability to modulate binding of the ligand to the EGF receptor in vivo or in vitro, thereby identifying a compound that modulates binding to the EGF receptor.
2. The method according to claim 1, wherein the testing in step (B) is performed by a high-throughput assay.
3. The method of claim 1, wherein the testing in step (B) comprises testing the compound for the ability to modulate EGF receptor mediated cell proliferation.
4. The method of claim 1, wherein step (A) involves designing or screening for a compound which binds to a β-sheet of the L1 domain within the structure formed by amino acids 1–475 of a receptor having the atomic co-ordinates as shown in FIG. 6 for amino acids 1–621 of the EGF receptor.
5. The method of claim 1, wherein step (A) involves designing or screening for a compound which binds to a β-sheet of the L2 domain within the structure formed by amino acids 1–475 or formed by amino acids 313–621 of a receptor having the atomic co-ordinates shown in FIG. 6 for amino acids 1–621 of the EGF receptor.
6. The method of claim 1 in which the compound is identified from test compounds in a database.
7. The method of claim 1, wherein step (B) comprises testing the compound for its ability to increase signal transduction by binding to the EGF receptor.
8. The method of claim 1, wherein step (B) comprises testing the compound for its ability to decrease signal transduction by binding to the EGF receptor.
9. The method of claim 1, wherein step (B) comprises testing the compound for its ability to inhibit or prevent the binding of a ligand to the EGF receptor.
10. A method of selecting a compound which binds to the EGF receptor comprising: (A) designing or screening for a compound which binds to the structure formed by amino acids 1–475 or formed by amino acids 313–621 of a receptor having the atomic coordinates as shown in FIG. 6 for amino acids 1–621 of the EGF receptor, where binding of the compound to the structure is favored energetically, and (B) selecting a compound designed or screened for in (A) which has an experimentally determined K_(d) or K_(l) of less than 10⁻⁶M for the EGF receptor, thereby selecting a compound which binds to the EGF receptor.
11. The method as claimed in claim 10, wherein K_(d) is less than 10⁻⁸M.
12. The method of claim 10, wherein K_(l) is less than 10⁻⁸M.